Principles of data visualization

MACS 40700 University of Chicago

Basic data structures

  • Data type
  • Dataset type

Data types

Source: Visualization Analysis and Design. Tamara Munzner, with illustrations by Eamonn Maguire. A K Peters Visualization Series, CRC Press, 2014.

Dataset types

Source: Visualization Analysis and Design. Tamara Munzner, with illustrations by Eamonn Maguire. A K Peters Visualization Series, CRC Press, 2014.

Tables

  • Flat table
    • Each row is an item
    • Each column is an attribute
    • Each cell is a value fully specified by the combination of row and column
  • Multidimensional table
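The two table shapes above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended storage format: the column names, city names, and figures are made-up placeholder values, and `cell` and `cube` are hypothetical helper names.

```python
# A flat table: each row is an item, each column is an attribute.
# All names and numbers here are made-up illustration values.
columns = ["city", "year", "population_k"]
rows = [
    ["Springfield", 2010, 116],
    ["Springfield", 2020, 114],
    ["Shelbyville", 2010, 42],
    ["Shelbyville", 2020, 45],
]


def cell(row_index, column_name):
    """Return the single value identified by a (row, column) pair."""
    return rows[row_index][columns.index(column_name)]


# A multidimensional table: each value is addressed by several keys at once.
# Here the keys are (city, year) and the value is the population.
cube = {(city, year): pop for city, year, pop in rows}

print(cell(2, "population_k"))
print(cube[("Shelbyville", 2020)])
```

In the flat table a cell is found by position plus attribute name, while in the multidimensional version the combination of keys does the addressing.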

Networks

A small example network with eight vertices and ten edges. Source: Wikipedia
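As a rough sketch of how a network dataset is stored, the snippet below builds an adjacency list from an edge list. The edge list is hypothetical: it has eight vertices and ten edges like the figure, but it is not the graph from the Wikipedia illustration.

```python
from collections import defaultdict

# Hypothetical undirected edge list: eight vertices (0-7), ten edges.
edges = [
    (0, 1), (0, 2), (1, 2), (1, 3), (2, 4),
    (3, 4), (3, 5), (4, 6), (5, 6), (6, 7),
]

# Items in a network are nodes; links record relationships between them.
adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)

vertices = sorted(adjacency)
# Degree = number of links attached to each node.
degree = {v: len(adjacency[v]) for v in vertices}

print(len(vertices), "vertices,", len(edges), "edges")
print(degree)
```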

Trees

Organization, mission, and functions manual: Civil Rights Division. Source: U.S. Department of Justice

Fields

Source: NASA Earth Observatory

Geometry

  • Shape of items with explicit spatial positions
  • 0D
  • 1D
  • 2D
  • 3D
  • Maps

Attribute types

Source: Visualization Analysis and Design. Tamara Munzner, with illustrations by Eamonn Maguire. A K Peters Visualization Series, CRC Press, 2014.

Semantics

  • Type vs. semantics
  • Key vs. value

Anscombe’s quartet

Summary statistics (\(r\) is the Pearson correlation between x and y)

Dataset   \(N\)   \(\bar{X}\)   \(\bar{Y}\)   \(r\)
1         11      9             7.500909      0.8164205
2         11      9             7.500909      0.8162365
3         11      9             7.5           0.8162867
4         11      9             7.500909      0.8165214
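The summary statistics above can be reproduced with plain Python. This is a minimal sketch using the standard published Anscombe values; the names `quartet`, `mean`, and `pearson_r` are chosen here for illustration. The value near 0.816 is the Pearson correlation coefficient; its square, the \(R^2\) of the simple regression, is roughly 0.67.

```python
from math import sqrt

# Anscombe's quartet: datasets 1-3 share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


summary = {
    k: (len(x), mean(x), mean(y), pearson_r(x, y))
    for k, (x, y) in quartet.items()
}
for k, (n, mx, my, r) in summary.items():
    print(f"Dataset {k}: N={n} mean_x={mx:.2f} mean_y={my:.4f} r={r:.4f}")
```

The four datasets are nearly indistinguishable on these numbers, which is the reason for plotting them before trusting the summary.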

Anscombe’s quartet

Linear regression of y on x

Dataset   Term          Estimate    Standard error   \(t\)-statistic   p-value
1         (Intercept)   3.0000909   1.1247468        2.667348          0.0257341
1         x             0.5000909   0.1179055        4.241455          0.0021696
2         (Intercept)   3.000909    1.1253024        2.666758          0.0257589
2         x             0.500000    0.1179637        4.238590          0.0021788
3         (Intercept)   3.0024545   1.1244812        2.670080          0.0256191
3         x             0.4997273   0.1178777        4.239372          0.0021763
4         (Intercept)   3.0017273   1.1239211        2.670763          0.0255904
4         x             0.4999091   0.1178189        4.243028          0.0021646
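The regression estimates above come from an ordinary least squares fit of y on x for each dataset. The sketch below recomputes the estimates, standard errors, and t-statistics with the textbook closed-form formulas; the function name `ols` and the dictionary keys are illustrative choices, and p-values are left out because the standard library has no t distribution.

```python
from math import sqrt

# Anscombe's quartet: datasets 1-3 share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def ols(x, y):
    """Simple linear regression y = intercept + slope * x via least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s2 = rss / (n - 2)
    se_slope = sqrt(s2 / sxx)
    se_intercept = sqrt(s2 * (1 / n + mx ** 2 / sxx))
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "se_slope": se_slope,
        "t_intercept": intercept / se_intercept,
        "t_slope": slope / se_slope,
    }


fits = {k: ols(x, y) for k, (x, y) in quartet.items()}
for k, fit in fits.items():
    print(f"Dataset {k}: intercept={fit['intercept']:.4f} "
          f"slope={fit['slope']:.4f} t_slope={fit['t_slope']:.3f}")
```

All four fits land on roughly the same line, y = 3 + 0.5x, even though the underlying point patterns differ.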

Anscombe’s quartet

Marks

Source: Visualization Analysis and Design. Tamara Munzner, with illustrations by Eamonn Maguire. A K Peters Visualization Series, CRC Press, 2014.

Channels

Source: Visualization Analysis and Design. Tamara Munzner, with illustrations by Eamonn Maguire. A K Peters Visualization Series, CRC Press, 2014.

How to pick an appropriate graph

  1. Think about the task or tasks you want to enable
  2. Try different graphic forms
  3. Arrange the components of the graphic
  4. Test the outcomes

What is the story?

Source: The Truthful Art: Data, charts, and maps for communication. Alberto Cairo. New Riders, 2016.

Basic charts

  • What is the purpose?
  • Comparisons
  • Proportions
  • Relationships
  • Location
  • Distribution
  • Patterns

Histogram

Density plot

Box-and-whisker plot

Bar chart

Grouped bar chart

Box plot

Violin plot

Scatterplot

Line graph

Grouped line charts

Network diagram

Heatmap

Stacked bar chart

Bubble chart

The Wealth & Health of Nations

Proportional area chart

Pie chart

Donut chart